11 research outputs found

    A foundation for reliable spatial proteomics data analysis.

    Quantitative mass-spectrometry-based spatial proteomics involves elaborate, expensive, and time-consuming experimental procedures, and considerable effort is invested in the generation of such data. Multiple research groups have described a variety of approaches for establishing high-quality proteome-wide datasets. However, data analysis is as critical as data production for reliable and insightful biological interpretation, and no consistent and robust solutions have been offered to the community so far. Here, we introduce the requirements for rigorous spatial proteomics data analysis, as well as the statistical machine learning methodologies needed to address them, including supervised and semi-supervised machine learning, clustering, and novelty detection. We present freely available software solutions that implement innovative state-of-the-art analysis pipelines and illustrate the use of these tools through several case studies involving multiple organisms, experimental designs, mass spectrometry platforms, and quantitation techniques. We also propose sound analysis strategies for identifying dynamic changes in subcellular localization by comparing and contrasting data describing different biological conditions. We conclude by discussing future needs and developments in spatial proteomics data analysis. .G., C.M.M., and M.F. were supported by the European Union 7th Framework Program (PRIME-XS Project, Grant No. 262067). L.M.B. was supported by a BBSRC Tools and Resources Development Fund (Award No. BB/K00137X/1). T.B. was supported by the Proteomics French Infrastructure (ProFI, ANR-10-INBS-08). A.C. was supported by BBSRC Grant No. BB/D526088/1. A.J.G. was supported by BBSRC Grant No. BB/E024777/ and a generous gift from King Abdullah University for Science and Technology, Saudi Arabia. D.J.N.H. was supported by a BBSRC CASE studentship (BB/I016147/1).

    Identifying the science and technology dimensions of emerging public policy issues through horizon scanning

    Public policy requires public support, which in turn implies a need to enable the public not just to understand policy but also to be engaged in its development. Where complex science and technology issues are involved in policy making, this takes time, so it is important to identify emerging issues of this type and prepare engagement plans. In our horizon scanning exercise, we used a modified Delphi technique [1]. A wide group of people with interests in the science and policy interface (drawn from policy makers, policy advisers, practitioners, the private sector and academics) elicited a long list of emergent policy issues in which science and technology would feature strongly and which would also necessitate public engagement as policies are developed. This was then refined to a short list of top priorities for policy makers. Thirty issues were identified within broad areas of business and technology; energy and environment; government, politics and education; health, healthcare, population and aging; information, communication, infrastructure and transport; and public safety and national security.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
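As a rough illustration of one of the baseline families named in the abstract above (Term Frequency-Inverse Document Frequency), the sketch below ranks candidate texts against a seed text by cosine similarity of TF-IDF vectors. It is not the consortium's implementation: the whitespace tokenizer, the smoothed IDF formula, the function names, and the example strings are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
            term: (count / len(toks)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(seed: str, candidates: List[str]) -> List[Tuple[int, float]]:
    """Return (candidate index, similarity) pairs, most similar first."""
    vectors = tfidf_vectors([seed] + candidates)
    scores = [(i, cosine(vectors[0], v)) for i, v in enumerate(vectors[1:])]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical seed and candidate texts, used only to exercise the functions.
seed_text = "protein localization by mass spectrometry"
candidate_texts = [
    "mass spectrometry of protein localization in cells",
    "public policy horizon scanning",
]
ranked = rank_candidates(seed_text, candidate_texts)
```

Here `ranked` lists the candidate that shares vocabulary with the seed first, while the candidate with no shared terms scores zero. A hybrid method of the kind the abstract suggests would combine rankings from several such scorers.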

    Detection of Functional Overreaching in Endurance Athletes Using Proteomics

    No reliable biomarkers exist to identify athletes in various training states including functional overreaching (FOR), non-functional overreaching (NFOR), and overtraining syndrome (OTS). Participants (N = 10, age 38.3 ± 3.4 years) served as their own controls and in random, counterbalanced order either ran/cycled 2.5 h (70.0 ± 3.7% VO2max) three days in a row (FOR) or sat in the lab (rest) (separated by three weeks; 7:00–9:30 am, overnight fasted state). Participants provided fingerprick dried blood spot (DBS) samples pre- and post-exercise/rest, and then during two recovery days. DBS proteins were measured with nanoLC-MS in data-independent acquisition (DIA) mode, and 593 proteins were identified and quantified. Proteins were considered for the FOR cluster if they were elevated during one of the two recovery days but not more than one of the exercise days (compared to rest). The generalized estimating equation (GEE) was used to identify proteins linked to FOR. A total of 13 proteins were linked to FOR, and most were associated with the acute phase response and innate immune system activation. This study used a system-wide proteomics approach to define a targeted panel of blood proteins related to FOR that could form the basis of future NFOR- and OTS-based studies.
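The FOR-cluster inclusion rule stated in the abstract above can be sketched as a simple filter. This is only an illustration under stated assumptions: "elevated" is read as an exercise-to-rest ratio above a hypothetical threshold, "one of the two recovery days" is read as at least one, and the protein names and values are invented. The study itself used a generalized estimating equation, which this sketch does not reproduce.

```python
from typing import Dict, List


def for_cluster_candidates(
    ratios: Dict[str, Dict[str, List[float]]],
    threshold: float = 1.5,  # hypothetical cutoff for "elevated", not from the study
) -> List[str]:
    """Return proteins that satisfy the stated FOR-cluster rule.

    ratios[protein]["exercise"] holds exercise/rest ratios for the exercise days,
    ratios[protein]["recovery"] holds them for the two recovery days.
    """
    selected = []
    for protein, days in ratios.items():
        elevated_exercise = sum(r > threshold for r in days["exercise"])
        elevated_recovery = sum(r > threshold for r in days["recovery"])
        # Elevated on at least one recovery day, but on no more than one exercise day.
        if elevated_recovery >= 1 and elevated_exercise <= 1:
            selected.append(protein)
    return selected


# Invented example values, used only to exercise the rule.
example = {
    "protein_A": {"exercise": [1.0, 1.1, 1.6], "recovery": [1.8, 1.2]},
    "protein_B": {"exercise": [1.7, 1.9, 1.0], "recovery": [2.0, 2.1]},
    "protein_C": {"exercise": [1.0, 1.0, 1.0], "recovery": [1.1, 1.0]},
}
candidates = for_cluster_candidates(example)
```

With these invented values only `protein_A` passes: `protein_B` is elevated on two exercise days, and `protein_C` is never elevated during recovery.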

    Proteomic Profiling and Monitoring of Training Distress and Illness in University Swimmers During a 25-Week Competitive Season

    Purpose: To evaluate relationships of proteomics data, athlete-reported illness, athlete training distress (TDS), and coaches' ratings of distress and performance over the course of the competitive season. Methods: Thirty-five NCAA Division II swimmers were recruited to the study (male n = 19, female n = 16; age 19.1 ± 1.6 years). Athletes provided fingerprick dried blood spot (DBS) samples, illness symptoms, and TDS every Monday for 19 of 25 weeks in their season. Coaches monitored performance and rated visual signs of distress. DBS samples were analyzed for a targeted panel of 12 immune-related proteins using liquid chromatography/mass spectrometry (LC/MS). Results: Thirty-two swimmers completed the protocol. The data were grouped in 2-3 week segments to facilitate interpretation and analysis. TDS scores varied between athletes, and were highest during the early fall conditioning ramp-up period (8.9 ± 1.6 at baseline to a peak of 22.6 ± 2.0). The percentage of athletes reporting illness was high throughout the season (50-78%). Analysis of TDS using Principal Component Analysis (PCA) revealed that 40.5% of the variance (PC1) could be attributed to illness prevalence, and TDS scores for the athletes reporting illness and no illness were different across the season (P < 0.001). The coaches' ratings of swim performance and swimmers' distress, sex, and racing distance (sprinters, middle distance, long distance) were not correlated with PC1. Linear Discriminant Analysis (LDA) of the data showed a separation of the baseline weeks from exam weeks with or without competitions, and with competitions alone (p < 0.001). Seven of the 12 proteins monitored over the course of training were upregulated, and the addition of the protein data to the LDA enhanced the separation between these groups of weeks. Conclusion: TDS and illness were related in this group of 32 collegiate swimmers throughout the competitive season, and expression of immune proteins improved the statistical separation of baseline weeks from the most stressful weeks. TDS data provided by the swimmers did not match their coaches' ratings of distress and swim performance. The importance of the immune system in the reaction to internal and external stress in athletes should be an area of further research.

    Proteomics-Based Detection of Immune Dysfunction in an Elite Adventure Athlete Trekking Across the Antarctica

    Proteomics monitoring of an elite adventure athlete (age 33 years) was conducted over a 28-week period that culminated in the successful, solo, unassisted, and unsupported two-month trek across Antarctica (1500 km). Training distress was monitored weekly using a 19-item, validated training distress scale (TDS). Weekly dried blood spot (DBS) specimens were collected via fingerprick blood drops onto standard blood spot cards. DBS proteins were measured with nano-electrospray ionization liquid chromatography tandem mass spectrometry (nanoLC-MS/MS) in data-independent acquisition (DIA) mode, and 712 proteins were identified and quantified. The 28-week period was divided into time segments based on TDS scores, and a contrast analysis between weeks five and eight (low TDS) and between weeks 20 and 23 (high TDS, last month of Antarctica trek) showed that 31 proteins (n = 20 immune related) were upregulated and 35 (n = 17 immune related) were downregulated. Protein–protein interaction (PPI) networks supported a dichotomous immune response. Gene ontology (GO) biological process terms for the upregulated immune proteins showed an increase in regulation of the immune system process, especially inflammation, complement activation, and leukocyte-mediated immunity. At the same time, GO terms for the downregulated immune-related proteins indicated a decrease in several aspects of the overall immune system process including neutrophil degranulation and the antimicrobial humoral response. These proteomics data support a dysfunctional immune response in an elite adventure athlete during a sustained period of mental and physical distress while trekking solo across Antarctica.

    Identification of Trans-Golgi Network Proteins in Arabidopsis thaliana Root Tissue

    Knowledge of protein subcellular localization assists in the elucidation of protein function and understanding of different biological mechanisms that occur at discrete subcellular niches. Organelle-centric proteomics enables localization of thousands of proteins simultaneously. Although such techniques have successfully allowed organelle protein catalogues to be achieved, they rely on the purification or significant enrichment of the organelle of interest, which is not achievable for many organelles. Incomplete separation of organelles leads to false discoveries, with erroneous assignments. Proteomics methods that measure the distribution patterns of specific organelle markers along density gradients are able to assign proteins of unknown localization based on comigration with known organelle markers, without the need for organelle purification. These methods are greatly enhanced when coupled to sophisticated computational tools. Here we apply and compare multiple approaches to establish a high-confidence data set of Arabidopsis root tissue trans-Golgi network (TGN) proteins. The method employed involves immunoisolations of the TGN, coupled to probability-based organelle proteomics techniques. Specifically, the technique known as LOPIT (localization of organelle protein by isotope tagging) couples density centrifugation with quantitative mass-spectrometry-based proteomics using isobaric labeling and targeted methods with semisupervised machine learning methods. We demonstrate that while the immunoisolation method gives rise to a significant data set, the approach is unable to distinguish cargo proteins and persistent contaminants from full-time residents of the TGN. The LOPIT approach, however, returns information about many subcellular niches simultaneously and the steady-state location of proteins. Importantly, therefore, it is able to dissect proteins present in more than one organelle and cargo proteins en route to other cellular destinations from proteins whose steady-state location favors the TGN. Using this approach, we present a robust list of Arabidopsis TGN proteins.